-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New Feature: Introduce __experimentalCreateInterpolateElement #17376
Conversation
* | ||
* @return {string[]} The generated tags for the given string. | ||
*/ | ||
export const getTagsFromString = ( tagString ) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getTagsFromString
method is never exposed to the public, does it need to be exported? I don't see unit tests for it, either. Not that it needs to be unit tested, but I'm trying to understand why it's exposed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
export was accidental from the extraction (moved in from another file).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed export in a02fcf7
packages/element/src/token-count.js
Outdated
@@ -0,0 +1,25 @@ | |||
let tokenCount = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, this should be a part of createInterpolateElement
- so you don't have to call resetTokenCount
each time but rather treat it as an internal state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya this is another extraction artifact, I'll move into the main file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
handled in 8711d13
packages/element/README.md
Outdated
@@ -138,6 +138,39 @@ _Returns_ | |||
|
|||
- `WPElement`: Element. | |||
|
|||
<a name="createInterpolateElement" href="#createInterpolateElement">#</a> **createInterpolateElement** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we expose it as an experimental function and see how it goes before we commit to the final shape?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, ya that sounds like a good approach. See 46c464e.
a02fcf7
to
46c464e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty solid already, especially given some of the tests with nesting of elements!
I left some comments focusing mainly on:
- Simplicity, with maintainability in mind
- The interface itself
* @return {Element} A react element. | ||
*/ | ||
const createInterpolateElement = ( interpolatedString, conversionMap ) => { | ||
keyIndex = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd love it if we could avoid keeping this kind of state in the module. I realise, given the kind of data being treated and the synchronous nature of this operation, that the risk of tainting this state is low, but fundamentally keyIndex
is a value we should be able to let recurse without external state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entirety of create-interpolate-element.js
is the module and keyIndex
is external to the specific function but not the module. It was written this way because (at least with the current recursion logic) passing through keyIndex
as an argument won't work due to the nesting handling. The iteration of keyIndex
at the parent level needs to increment if children update it.
What suggestion would you have for an alternative approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was written this way because (at least with the current recursion logic) passing through
keyIndex
as an argument won't work due to the nesting handling.
Yeah, I suspected this would be why.
What suggestion would you have for an alternative approach?
I don't have a concrete approach, although intuitively maybe this would work if keyIndex
weren't sequential, but a vector of sorts to represent branching/nesting — e.g. if, instead of just going from 0
to Inf
, it would support values like '1'
, '1.0.2'
… so that nested calls of recursiveCreateElement
can act without reporting data back to their callers.
But this only an intuition, I haven't tried any of it. And I wouldn't block the PR over it. But does it sound like a viable approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although intuitively maybe this would work if keyIndex weren't sequential, but a vector of sorts to represent branching/nesting
This would be nice as it would connect branches with parents more visibly via the key value, but there would still be the issue of how would the value be bumped for parents (i.e. 1.0 -> 2.0) when there's a new branch? I think I'd run into the same issue that I had with straight incrementation.
One possibility would be to pass an object along (via an argument) instead of a primitive. As a reference it would be mutable and thus parents would always have the latest iterative value. But I feel that from a risk vector, that's not much different than the current implementation.
* If the conversion map is not a valid array or empty then just return the | ||
* element. | ||
*/ | ||
if ( ! Array.isArray( conversionMap ) || ! conversionMap.length ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my other comment about validating conversionMap
in createInterpolateElement
beforehand. With that in place, this check could be reduced to just the length condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reply to your validation comment applies here. I have a feeling I'm not comprehending what you had in mind though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the other comment thread, I think your argument that it would add an unnecessary loop over conversionMap
is valid. In this case here, though, we could check once in createInterpolateElement
that Array.isArray( conversionMap )
and avoid repeating that test in each recursiveCreateElement
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
conversionMap
is mutated, so there's still the necessity for verifying that it is valid before continuing to execute.
{}, | ||
recursiveCreateElement( | ||
interpolatedString, | ||
Object.entries( conversionMap ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this has been discussed elsewhere, but I have doubts about an interface that is by nature unordered (an object) which is then consumed sequentially (see instances of conversionMap.shift()
et al.).
As an outside developer, I would expect one of two scenarios:
conversionMap
is an object keyed on search strings, in which case the order of the object's values is irrelevant.conversionMap
is an array of conversion configs, in which case their order needs to follow the interpolated string. A variant of this would be aprintf
-style argument list of( interpolateString, ...conversionConfigs )
:
createInterpolateElement(
'This is a complex string having a %1$s value, with a <a1>nested <em1>%2$s</em1> link</a1> and value: %3$s',
{ value: 'concrete' },
{ tag: 'a', props: {} },
{ tag: 'em', props: {} },
{ value: 'value' },
{ value: <TestComponent /> },
)
The other thing on my mind is whether we should allow numerical tags. Because, if so, one could reasonably expect the equivalence:
createInterpolateElement(
'Hello, %2$s. You have %1$s new message(s).',
'User',
42
) === createElement( Fragment, {}, [
'Hello, ',
42,
'. You have ',
'User',
' new message(s).'
] )
And I do apologise for raising questions on the interface itself at this stage of development!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I do apologise for raising questions on the interface itself at this stage of development!
No apologies needed! I'm happy to get feedback on anything in this pull :)
I don't know if this has been discussed elsewhere, but I have doubts about an interface that is by nature unordered (an object) which is then consumed sequentially (see instances of conversionMap.shift() et al.).
Ya, I agree. I think I reacted to quickly to some feedback @gziolo gave (which doesn't exist now I think because of a rebase?) about the original shape which was something like this:
[
[ 'span%1', { tag: CustomComponent, props: {}, hasChildren: true } ],
[ 'a%1', { tag: 'a', props: { href: url }, hasChildren: true } ],
[ '%1$s', { tag: CustomComponentB, props: {}, hasChildren: false } ],
[ '%2$s', { value: 'custom value' } ],
]
His feedback was to do the object shape like what I have (which I think was great) primarily to promote the necessity for tag identifiers for each element in the map to be unique. I thought I could get away with preserving the existing logic by just converting things to an array, but I totally get what you're saying about the current shape implying that order doesn't matter, when in fact, it does.
I think it's important we preserve the ability to name tags explicitly both to clearly differentiate between them, and also to potentially add the ability to add more context for translators (cc @akirk on this... I think he might be able to elaborate on that point).
So as far as interface, assuming we preserve the existing logic (which imo is a fairly simple straightforward parse avoiding some of the typical problems with nested tags), the following requirements exist on the map:
- identifiers must me unique.
- The map must be in order of tags in the interpolated string.
Additional requirements that I have picked up from other sources are:
- Context must be preserved for translators (i.e. identify the value is a link). Only tokens is not acceptable.
- Having the ability to do not only
<a1>
but also<a link to marketing>
to provide context would be useful.
So with that in mind, how do you feel about an interface that is something like this:
__experimentalCreateInterpolateElement(
'This is a complex string having a <concrete> value, with ' +
'a <a1>nested <em1><value></em1> link</a1> and value: <A Special Component />,
{ value: 'concrete' },
{ tag: 'a', props: {} },
{ tag: 'em', props: {} },
{ value: 'value' },
{ custom: 'A Special Component', value: <TestComponent /> }
);
The following criteria exist for the arguments in this proposal:
- They are in order according to how they appear in the string.
- No need to provide a custom identifier as it will use the string representation of whatever the
tag
orvalue
prop is. However, they simply will be incremented internally for the matches (which means that the interpolation string must have numbers identifying unique tags. We kind of need to keep this in place otherwise we can get into some pretty hairy nesting match expressions for the regex.) - If one wants to be explicit for tag name in the string, they can by simply providing a
custom
prop with the name being used in the string.
Note, this proposal also gets rid of the numerical %1$s
type tags that can get confused with printf
behaviour. Thoughts?
cc @swissspidy or @ocean90 - I'd appreciate your thoughts on this interface as well from the perspective of both the string that translators would see, and the usability of this from the standpoint of code consuming the interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
conversionMap
is an object keyed on search strings, in which case the order of the object's values is irrelevant.
I assumed this scenario based on the examples and tests provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't looking at the whole picture here; in particular, where in all this would __
be called. I assume that one should be able to construct an expression tree such as:
createInterpolateElement(
__( 'This is a <span%1>string</span%1> with <a%1>a link</a%1>, a self-closing %1$s component and a plain value: %2$s' ),
{
// conversionMap
}
);
(the above example was taken from the description of #17376)
In other words, createInterpolateElement
should accept a translated string as its first argument. In the course of translation — the sentence structure in the target locale being entirely dependent on the language — we must assume that tokens have been reordered.
With that in mind, only two high-level approaches are possible:
- We go "full
sprintf
" and require all tokens to be of the%1$s
kind, so that then the interface can be order-sensitive:createInterpolateElement( myString, config1, config2, … )
- We go with unique slugs as tokens. This could look like what we have right now (
createInterpolateElement( testString, { em1: { tag: 'em', props: {} }, a1: { tag: 'a', props: {} } } )
but without imposing any particular order on the conversion map. If I may add: since this would imply a requirement of slug unicity, then a proper map (JS object) would definitely be the way to go, because then the data type itself would enforce the requirement (in contrast, a pairs format like[ [ key1, value1 ], [ key2, value2] ]
would require extra logic to check the unicity of the keys).
Whatever the decisions for the interface, the implementation itself will be order-insensitive: internally it must be able to access the conversion configurations non-sequentially.
I think it's important we preserve the ability to name tags explicitly both to clearly differentiate between them, and also to potentially add the ability to add more context for translators
IMO slugs could help avoid human error, both for the developer and the translator, but would do little for context. To provide good context, the usual methods would still apply (translators:
string and _x()
-type functions). The slugs would still have to be explained, so the gain is small. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we must assume that tokens have been reordered.
Ya I concur. It does complicate things (especially for nesting) but it's pretty clear it's a requirement. I'll work on implementing that the next chance I get to work on this pull.
I don't think we should got full sprintf on this because that has been nixayed as an option here (and in other conversations in core-js chat).
So I think we'll have to go with the approach of the custom token route.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO slugs could help avoid human error, both for the developer and the translator, but would do little for context. To provide good context, the usual methods would still apply (translators: string and _x()-type functions). The slugs would still have to be explained, so the gain is small. :)
I agree here too, I wasn't suggesting that they'd be used in lieu of _x()
type functions but more as a complement. It's something that was brought up by @akirk in conversations I had with him so I mostly recorded it here for context. I think descriptive slugs also have the potential to make things more confusing for translators because they might see that as something to translate itself! So while we can support it, I think best practices would be to limit custom tags to variations of common elements (<a>
, <a1>
, <span>
, <strong>
, <span>
etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Descriptive placeholders are good for translators. Translation software will protect placeholders and requires them to appear in the translation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #17376 (comment)
getRegEx( tags, searchString, 'children' ) | ||
); | ||
if ( match !== null ) { | ||
conversionConfig.hasChildren = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The risk of actual unexpected mutations is probably low, but I wonder if we should deeply clone the conversion map before calling recursiveCreateElement
. Or carry over the hasChildren
datum by some other means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... the risk will primarily be if the conversion map is used in sharing the conversionMap they provide to createInterpolateElement
with some other logic outside of here. So ya it's low, but does present potential for a gotcha. If we change the interface to something like the interface comment discussion, then I can clone the config before mutating it (or look at doing what you suggest with communicating the hasChildren
type some other way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to leave this as is. We can see how the experiment plays out but configs are intended to be one use and context specific. The only place this mutation would be a problem is if consumers share configurations across localized strings which is highly unlikely (and any use-cases where I might see that happen would be in conditional statements where mutation would not matter).
46c464e
to
1b18b0f
Compare
Thanks for addressing some of the feedback so far, @nerrad! Let me know when you need a new look at this. |
e55ea05
to
b79556b
Compare
The following changes were added in b79556b per the discussion on the interface (I've also updated the original comment for this pull to reflect the changes): Argument interface
InternalsThis turned out to be a bit easier to do than I thought it would be. All I had to do is internally make sure that the conversion map was re-ordered to match the order of the tags in the string. This was accomplished by the new TestingI modified the tests to reflect the interface changes and also to verify that order does not matter (particularly the complex string test for that). Ready for another review and feedback (cc @mcsf) |
testString, | ||
{ someValue: { value: 10 } } | ||
) | ||
).toEqual( <Fragment>{ testString }</Fragment> ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we should be testing here to that we toEqual( testString )
and not have the <Fragment />
Just a heads up, @dmsnell and I collaborated on this (well he wrote the code, I mostly watched ;) ) at WordCamp US so I'll be pushing what he did (with some tweaks by moi) sometime this week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a heads up, @dmsnell and I collaborated on this (well he wrote the code, I mostly watched ;) ) at WordCamp US so I'll be pushing what he did (with some tweaks by moi) sometime this week.
Thanks for the heads-up! Since I had started a new review last week, I might as well send what I had, should any of it still help.
return conversionMap.sort( ( [ tagA ], [ tagB ] ) => { | ||
tagA = `<${ tagA }`; | ||
tagB = `<${ tagB }`; | ||
return interpolatedString.indexOf( tagA ) > interpolatedString.indexOf( tagB ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comparison function should return { -1, 0, +1 }
(see Array#sort).
tagA = `<${ tagA }`; | ||
tagB = `<${ tagB }`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering that the important part here is to perform a lexical sort, is it necessary to prepend '<'
?
const reorderMapByElementOrder = ( interpolatedString, conversionMap ) => { | ||
// if length of map is only one then we can just return as is. | ||
if ( conversionMap.length === 1 ) { | ||
return conversionMap; | ||
} | ||
return conversionMap.sort( ( [ tagA ], [ tagB ] ) => { | ||
tagA = `<${ tagA }`; | ||
tagB = `<${ tagB }`; | ||
return interpolatedString.indexOf( tagA ) > interpolatedString.indexOf( tagB ); | ||
} ); | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the whole function, and also given the downside that Array#sort is mutating and that we still want to perform a normal sort but by a specific key, I'd replace all this with
import { sortBy } from 'lodash';
sortBy( conversionMap, ( [ tag ] ) => tag )
|
||
if ( ! isValidConversionMap( conversionMap ) ) { | ||
return interpolatedString; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if we shouldn't throw an error instead. Same for the following if
block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya I wondered about this as well. There's a similar conditional in the new code.
*/ | ||
const isValidConversionMap = ( conversionMap ) => { | ||
return typeof conversionMap === 'object' && | ||
typeof conversionMap.length === 'undefined'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This struck me as a little odd, but then again JS is pretty odd for not having a properly semantic "dictionary" type. ¯_(ツ)_/¯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ya I was trying to avoid using lodash in here. Otherwise I'd just use lodash.isPlainObject
(I realize this isn't an exact equivalent).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't actually detecting if we have a valid conversion map; it's determining if we're an object without a length property.
isValidConversionMap( new Map() ) // true
isValidConversionMap( {} ) // true
isValidConversionMap( Object.create(null) ) // true
isValidConversionMap( JSON ) // true
If we're going to validate we probably want to make more certain assurances and even give a more specific name to what we're doing. If I read isValid
I don't really know how to know if the function is correct or finished.
const conversionMapError = s => `createInterpolateComponent: second argument should be { [string]: Element } but ${ s }`;
const matchesObjectOfElements = o => {
if ( 'object' !== typeof o ) {
return [ false, conversionMapError( `found ${ typeof o }` ) ];
}
o.getOwnPropertyNames().forEach( name => {
if ( ! isValidElement( o[name] ) ) {
return [ false, conversionMapError( `found key ${ name } with invalid Element of type ${ typeof o[name] }` ) ];
}
} );
return [ true ];
}
const validateConversionMap = map => {
const [ isObjectOfElements, error ] = matchesObjectOfElements( map );
if ( IS_DEBUG && ! isObjectOfElements ) {
throw new TypeError( error );
}
}
const isValidConversionMap = map => {
try {
validateConversionMap( map );
return true;
} catch ( e ) {
return false;
}
}
This is verbose and ugly but it more closely follows the constraints we are imposing on the system - it's verifying that the things given won't break in the code we write.
By splitting the part that throws from the part that checks we can provide different levels of help when things are invalid. It's probably fine to skip returning true
and false
entirely and rely on throwing the errors. If there's any decent fallback behavior to use then we can try and to that with the boolean catch on false
but otherwise I'm a fan of letting it crash as early as possible to make it harder to miss the bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Truly valid conversion maps can only be determined when trying to insert the elements. Another approach is to throw when adding the component into our stack or output. That's less clean but more complete. That approach is also complementary with this approach: validate structure up-front and validate data during the loop.
b79556b
to
a822373
Compare
Update: The latest commit (a822373) is a refactor of the parser for the localization string. It is the fruit of some work done at WCUS with @dmsnell to improve the algorithm for parsing the string and converting to a React.element. Some highlights:
const element = __experimentalCreateInterpolateElement(
sprintf( 'This is a <em>string</em> with a %s', 'value' ),
{ 'em': ['em'] }
);
I think we could merge this pull as is to start getting some use out of the function in places it is needed. However, in future iterations and separate pulls, I think we should:
translatedString = <Translate count=2>
<>This is a <em>string</em></>
<>These are some <em>strings</em></>
</Translate>;
// will get transpiled to
translatedString = createInterpolateElement(
_n( 'This is a <em>string</em>', 'These are some <em>strings</em>', count ),
{
em: [ 'em' ]
}
); |
3790223
to
5c16adf
Compare
This code doesn’t cover all cases of garbage input and the test (while passing) gave false confidence in the one garbage input scenario that was covered. Handling all cases of garbage input is a slippery slope covered with barnacles and broken glass.
I opened #18623 to give it a try, it works like a charm 😍 |
FYI @akirk |
When do you think we should make this API stable? Do you expect changes in the future? it seems solid enough. |
If anything, I might like to see if it requires any adaptations for how we're considering usability improvements in #18614, but generally I think it's pretty stable. |
Description
This new function was first introduced as a part of the experimental pull in #16374. This is a proposed initial iteration of a solution for #9846. While I recognize this is not a complete solution, I think it provides an api for working with interpolated elements that also solves the need for i18n in those contexts.
This function:
wp.element.experimentalCreateInterpolateElement
.Introduction of
createInterpolateElement
(initially as__experimentalCreateInterpolateElement
).This function works similarly to
createElement
in exposing an api for creating a WordPress (React) element.It accepts two arguments:
Example:
The function returns this as the equivalent:
This allows for usage immediately wherever interpolation is needed for translated strings.
Some notes:
Tags can be virtually anything, but for the purpose of localization it's recommended to use something that provides context for translators (eg. instead of just
<a>
you could do<a_link_to_a_site>
).Tags must not have spaces.
The conversion map is expected to be an object of configuration elements (order does not need to match the order of tags in the string). Each element in the object is expected to have two values:
WPElement
(eitherJSX
or viawp.element.createElement
)How has this been tested?
This currently has unit tests covering behaviour. Recommendation is to try resolving this pull with this work.
Types of changes
This is a new feature that introduces new functionality.
Checklist: